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Abstract 

A  method  for  estimating  parameters  in  dynamic  stochastic  (Markov  Chain)  models  based  on 
Kurtz’s  limit  theory  coupled  with  inverse  problem  methods  developed  for  deterministic  dynam¬ 
ical  systems  is  proposed  and  illustrated  in  the  context  of  disease  dynamics.  This  methodology 
relies  on  finding  an  approximate  large-population  behavior  of  an  appropriate  scaled  stochastic 
system.  This  approach  leads  to  a  deterministic  approximation  obtained  as  solutions  of  rate 
equations  (ordinary  differential  equations)  in  terms  of  the  large  sample  size  average  over  sample 
paths  or  trajectories  (limits  of  pure  jump  Markov  processes).  Using  the  resulting  deterministic 
model  we  select  parameter  subset  combinations  that  can  be  estimated  using  an  ordinary-least- 
squares  (OLS)  or  generalized-least-squares  (GLS)  inverse  problem  formulation  with  a  given  data 
set.  The  selection  is  based  on  two  criteria  of  the  sensitivity  matrix:  the  degree  of  sensitivity 
measure  in  the  form  of  its  condition  number  and  the  degree  of  uncertainty  measured  in  the 
form  of  its  parameter  selection  score.  We  illustrate  the  ideas  with  a  stochastic  model  for  the 
transmission  of  vancomycin-resistant  enterococcus  (VRE)  in  hospitals  and  VRE  surveillance 
data  from  an  oncology  unit. 
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selection,  large  population  sample  path  approximations. 
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1  Introduction 


Closely  tied  to  the  formulation  of  the  mathematical  models  is  the  need  to  estimate  the  parameters 
(including  initial  conditions)  involved  as  well  as  provide  uncertainty  bounds  for  the  estimates. 
Validating  the  mathematical  models  with  empirical  data  for  the  system  under  investigation  is  a 
key  step  in  gaining  insight  into  the  system  process  and  evaluating  the  effectiveness  of  particular 
control  strategies  [8,  10,  11,  12,  21,  24].  A  number  of  advanced  mathematical  and  statistical 
tools  for  parameter  estimation  in  deterministic  dynamic  models  are  readily  available.  The  key 
objective  of  this  paper  is  to  present  a  methodology  to  estimate  parameters  in  a  stochastic  model 
using  inverse  problem  methods  developed  for  deterministic  dynamical  systems.  In  these  inverse 
problem  methods,  parameter  estimates  along  with  uncertainty  bounds  (confidence  intervals)  are 
readily  obtained  from  longitudinal  data  for  a  single  realization  of  the  observation  process  for  the 
stochastic  system.  Moreover,  sensitivity  analysis  along  with  parameter  selection  (determining  which 
parameters  are  most  “identifiable”  with  the  given  data)  can  be  done  without  massive  simulation 
studies. 

It  is  difficult  to  carry  out  the  above  estimation  related  tasks  directly  with  stochastic  models 
and  limited  data.  The  methodology  presented  in  this  paper  is  based  on  using  an  approximate 
large-population  behavior  of  an  appropriate  scaled  stochastic  system  using  Kurtz’s  limit  theory 
[19,  22],  By  scaling  the  stochastic  system  and  applying  the  Strong  Law  of  Large  Numbers  (SLLN) 
for  the  relevant  Poisson  process,  we  can  derive  the  corresponding  deterministic  approximation 
as  solutions  of  rate  equations  in  terms  of  the  large  sample  size  average  over  sample  paths  or 
trajectories.  Using  the  resulting  deterministic  model  we  select  parameter  subset  combinations  that 
can  be  estimated  using  an  ordinary-least-squares  (OLS)  or  generalized-least-squares  (GLS)  inverse 
problem  formulation  with  a  given  data  set  along  with  an  appropriate  statistical  model  for  the 
observation  process. 

Given  an  experimental  data  set,  a  mathematical  model  may  be  more  sensitive  to  some  param¬ 
eters  than  others,  and  the  dependence  between  the  parameters  can  impact  the  well-posedness  of 
an  inverse  problem.  Therefore,  it  is  of  interest  to  limit  the  attempted  estimation  to  subsets  of 
parameters  for  which  the  mathematical  model  is  most  sensitive.  The  analysis  used  to  select  the 
type  of  inverse  problem  formulation  and  the  subset  of  parameters  to  be  estimated  from  a  given 
data  set  is  based  on  previous  ideas  in  the  literature  [6,  14,  17],  and  is  reviewed  in  Section  4.  The 
selection  procedure  is  based  on  two  criteria  of  the  sensitivity  matrix:  the  degree  of  sensitivity 
measure  in  the  form  of  its  condition  number  and  the  degree  of  uncertainty  measured  in  the  form 
of  its  parameter  selection  score.  The  idea  is  to  select  first  all  parameter  combinations  with  a  full 
rank  sensitivity  matrix  and  then  calculate  the  corresponding  Fisher  matrix  condition  numbers  and 
selection  scores.  Then  parameter  subset  combinations  with  small  condition  numbers  and  selection 
scores  are  considered  as  feasible  (i.e.,  can  be  estimated  with  reasonable  levels  of  uncertainty). 

The  motivation  for  this  manuscript  derives  from  our  interest  in  understanding  the  spread 
of  infectious  diseases  in  particular,  nosocomial  infections,  in  order  to  prevent  major  outbreaks 
in  hospital  settings.  Thus  in  Section  2  we  introduce  a  stochastic  model  of  the  transmission  of 
Vancomycin-Resistant  Enterococcus  (VRE)  in  hospitals  that  is  used  to  illustrate  the  methodology 
introduced  in  this  paper.  This  model  was  developed  in  our  initial  studies  of  VRE  (see  [25]  as  well  as 
[2,  3,  4,  7,  10,  15,  16,  23,  24,  28,  29]).  In  Section  2.2  we  show  in  detail  how  to  obtain  the  correspond¬ 
ing  deterministic  approximation  using  Kurtz’s  theory.  We  follow  in  Section  3  with  a  description  of 
the  surveillance  data  motivating  our  efforts  and  the  parameters  that  can  be  estimated  directly  from 
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the  data.  In  Section  4  we  review  the  inverse  problem  and  parameter  selection  methodology  used 
to  estimate  parameters  and  quantify  uncertainty  for  problems  with  deterministic  systems.  Finally, 
in  Sections  5  and  6  we  present  some  illustrative  results  and  along  with  some  summary  conclusions. 

2  A  Motivating  VRE  Stochastic  Model 

For  our  motivating  example  model,  patients  in  a  hospital  unit  are  classified  by  compartments 
or  states  as  either  uncolonized  U(t),  VRE  colonized  C(t),  or  VRE  colonized  in  isolation  J(t),  as 
depicted  in  the  compartmental  schematic  of  Figure  1.  A  description  of  the  variables  and  parameters 
are  given  in  Table  1.  Patients  are  admitted  to  the  hospital  unit  at  a  rate  A  per  day  and  some 
fraction  m  are  already  VRE  colonized.  The  transition  from  one  compartment  to  another  follows 
an  exponential  distribution  and  the  expected  mean  duration  within  a  compartment  is  given  by  the 
inverse  of  the  parameter  of  the  exponential  distribution.  A  hand-hygiene  policy  applied  to  health 
care  workers  on  isolated  VRE  colonized  patients  reduces  infectivity  by  a  factor  of  7  (0  <  7  < 
1).  It  is  assumed  VRE  colonization  periods  last  from  weeks  to  months  and  because  spontaneous 
decolonization  occurs  infrequently,  clearance  of  the  bacteria  is  not  considered  in  the  model.  VRE 
colonized  patients  are  moved  into  isolation  at  a  rate  a. 


Community 


Figure  1:  Compartmental  VRE  model 


2.1  The  VRE  stochastic  model 

The  dynamics  of  the  VRE  colonization  of  patients  in  a  hospital  unit  are  modeled  as  a  continu¬ 
ous  time  Markov  Chain  (MC)  with  discrete  state  space  [1,  5,  26]  embedded  in  M3.  In  this  case, 
the  population  of  patients  is  considered  discrete  (i.e.,  VRE  colonization  occurs  in  units  of  whole 
individuals)  and  the  timing  of  events  is  a  probabilistic  process.  The  state  of  the  MC  at  time  t  is 
denoted  by  {U(t)  =  i,C(t)  =  j,J(t )  =  k},t  >  0  and  i,j,k  6  {0, 1,...IV}.  The  probability  that 
during  a  small  time  interval,  dt,  of  transiting  from  one  state  to  another  is  described  by 

P{U(t  +  dt)  =  i  —  1,  C(t+  dt)  =  j  +  1,  J(t+  dt)  =  k\U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  mfjLiU  dt+  /3U[C  +  (1  —  7  )J]dt+  o{dt),  (1) 
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Table  1:  Model  Parameters 


Variables 

Description 

Units 

u(t) 

Number  of  uncolonized  patients 

Individuals 

C(t) 

Number  of  VRE  colonized  patients 

Individuals 

At) 

Number  of  VRE  colonized  patients  in  isolation 

Individuals 

Parameters 

Description 

Units 

A 

Patients  admission  rate 

Individuals  /  day 

m 

VRE  colonized  patients  on  admission  rate 

Dimensionless 

P 

Effective  contact  rate 

1/  day 

7 

HCW  hand  hygiene  compliance  rate 

Dimensionless 

a 

Patient  Isolation  rate 

1/  day 

+ 1 

Uncolonized  patients  discharged  rate 

1/  day 

M 2 

VRE  colonized  patients  discharged  rate 

1/  day 

P{U(t  +  dt )  =  i,  C(t  +  dt)  =  j  +  1,  J(t  +  dt)  =  k  —  1| U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  mn2Jdt+  o(dt),  (2) 

P{U(t  +  dt)  =i+l,  C(t+  dt)  =  j  —  1,  J(t+  dt)  =  k\U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  (1  —  m)n2Cdt+  o(dt),  (3) 

P{U(t+  dt)  =  i  +  1,  C(t+  dt)  =  j,  J(t+  dt)  =  k  —  1| U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  (1  —  m)ii2Jdt+  o(dt),  (4) 

P{U(t+  dt)  =  i,  C(t+  dt)  =  j  —  1,  J(t  +  dt)  =  k  +  1| U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  aCdt+o(dt),  (5) 

P{U ( t  +  dt)  =  i,C(t+  dt)  =  j,  J(t  +  dt)  =  k\U(t)  =  i,  C(t)  =  j,  J(t)  =  k} 

=  (1  —  m)niU  dt  +  mn2Cdt+  [1  —  (A  +  /3U[C  +  (1  —  7)  J]  +  aC)]dt+  o(dt).  (6) 


In  the  VRE  epidemic  model  a  constant  population  is  assumed  in  which  the  hospital  remains  full 
for  all  t  (i.e. ,  overall  admission  rate  equals  overall  discharge  rate,  A  =  fi\U  +  ^2 (C  +  J)).  Hence, 
the  admission  of  a  patient  in  either  compartments  U  or  C  are  dependent  events  on  the  discharged 
in  either  compartment  U  or  C  or  J  (or  vice  versa).  We  assume  that  when  a  patient  is  discharged 
from  the  hospital,  he/she  is  immediately  replaced  by  an  admission  into  the  compartment  U  with 
probability  (1  —  m)  or  into  the  compartment  C  with  probability  m.  Equation  (1)  is  the  probability 
of  entering  compartment  C  by  either  an  admission  (due  to  a  discharge  in  compartment  U)  or  by 
effective  colonization.  Equation  (2)  is  the  probability  of  entering  compartment  C  by  an  admission 
due  to  a  discharge  in  J.  Equation  (3)  is  the  probability  of  admission  to  compartment  U  by  a 
discharge  in  C .  Equation  (4)  is  the  probability  of  admission  into  compartment  U  by  a  discharge 
in  J.  Equation  (5)  is  the  probability  of  moving  a  VRE  colonized  patient  into  isolation.  Finally, 
Equation  (6)  is  the  probability  that  none  of  the  states  changes  due  to:  an  uncolonized  patient 
being  discharged  and  replaced  back  into  the  U  compartment,  or  a  VRE  colonized  patient  in  C 
being  discharged  and  replaced  back  into  the  C  compartment,  or  no  event  occurs. 
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Table  2:  Transition  rates 


Event 

Effect 

Transition  rate 

Discharge  of  uncolonized  patient 

(U,C,J)=(i-l,j,k) 

Ai  =  HiU 

Discharge  of  VRE  colonized  patient 

(U,C,J)=(i,j-l,k) 

A2  = 

Discharge  of  VRE  colonized  patient 
in  isolation 

(U,C,J)=(i,j,k-l) 

A3  =  H2  J 

Patient  colonization  due  to  VRE 
colonized  patients 

(U,C,J)=(i-l,j+l,k) 

a4  =  p  UC 

Patient  colonization  due  to  VRE 
colonized  patients  in  isolation 

(U,C,J)=(i-l,j+l,k) 

a5  =  P{i  —  l)U  J 

Isolation  of  VRE  colonized  patient 

(U,C,J)=(i,j-l,k+l)) 

A6  =  a(J 

Admission  of  uncolonized  patient 

U=i+1 

(1  -  m)(Ai  +  A2  +  A3) 

Admission  of  VRE  colonized  patient 

C=j+1 

m(Ai  +  A2  +  A3) 

2.2  The  deterministic  approximation 

When  dividing  equations  (l)-(6)  by  dtt  and  taking  the  limit  when  dt  tends  to  0+,  we  obtain  the 
rates  of  transitions  as  summarized  in  Table  2.  In  the  stochastic  model,  the  rates  represent  the  mean 
number  of  transitions  that  can  be  expected  in  a  given  period,  with  the  actual  numbers  distributed 
about  these  means.  Hence,  the  rates  determine  the  frequencies  of  occurrence  through  time  for  the 
transitions  or  events. 

Letting  Ri(t)  for  i  =  1,  ...,6,  be  the  number  of  times  the  ith  transition  has  occurred  by  time  t. 
Then,  the  state  of  the  system  at  time  t  can  be  written  as 

U(t)  =  U(0)-R1(t)-R4(t)-R5{t)  +  (l-m)(R1(t)  +  R2(t)  +  R3(t)) 

C(t )  =  (7(0)  —  R2(t)  +  R4{t)  +  R§(t)  —  Re(t)  +  m(Ri(t)  +  R2{t)  +  R3(t )) 

J{t)  =  J(0)-R3(t)  +  R6(t),  (7) 

where  Ri(t)  is  a  counting  process  with  intensity  A (7(t),  J(t ))  given  by 

Ri(t)=Yi  (J*  A*(C/(s),  (7(s),  J(s))d-s^j  ,i  =  1,...,6,  (8) 

with  Yi  as  independent  unit  Poisson  processes.  Note  that  the  state  of  the  system  is  ( U(s ),  C(s),  J(s )) 
and  hence  each  A*(C/(s),  (7(s),  J(s))  is  constant  between  transition  times.  Also,  note  that  sample 
paths  ri(t)  of  Ri(t)  are  given  in  terms  of  sample  paths  (u(t) ,  c(t) ,  j (t))  of  (U (t) ,  C (t) ,  J (t))  by 

n(t)  =  YifJ  Xi(u(s),c(s),j(s))ds\  ,i  =  1,...,6.  (9) 

Let  UN(t )  =  U(t)/N,  CN(t )  =  C(t)/N,  JN(t)  =  J(t)/N  be  the  patient  units  per  system  size  or 
the  proportion  in  the  stochastic  process.  The  corresponding  sample  paths  are  (uN (t),  cN (t),  jN (f)). 
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We  express  the  rates  A*  for  i  =  1, 6  in  terms  of  these  scaled  variables  as  follows: 

Ai  =  fi\u{t)  =  N  fiiuN  (t),  A4  =  (3u(t)c(t)  =  N2  (3uN  (t)cN  (t), 

A2  =  li2c{t)  =  N/i2cN(t),  A5  =  /3(1  -  7 )u(t)j(t)  =  N2/3(  1  -  ry)uN (t)jN (t), 

A3  =  1^2  j(t)  =  N/j,2jN(t),  A6  =  ac(f)  =  NacN(t). 

Using  these  rates  we  can  obtain  an  approximation  for  rf  (t,).  the  averages  of  the  77(f)  of  (9)  by  apply¬ 
ing  the  SLLN  for  the  Poisson  Process  (i.e.,  Y(Nfi)/N  ~  //).  The  process  results  in  approximating 
for  large  N .  the  sample  paths  (uN (t),  cN (f),  jN (t))  by  a  large  sample  size  average  approximation 
over  paths  (u(t) ,  c(i) ,  j (t))  defined  by  a  deterministic  system.  That  is,  we  approximate  integrals 
in  the  averaged  sample  path  equations  by  integrals  that  are  used  as  the  defining  equations  for  the 
deterministic  sample  paths.  This  is  done  by  the  approximations 


r?W  =  :w-  =  TfYi(l 

=  j^Yi  (N  jo  Mi uN(s)ds)  ~  ^Y,  (nJ  hi u(s)ds^j 

Hi  u(s)ds.  (10) 

The  approximations  for  r^(t)  for  i  =  2,  ...,6,  can  be  obtained  similarly.  When  dividing  both  sides 
of  the  sample  path  analogue  of  each  equation  in  (7)  by  N  and  applying  the  approximations  for 
rf(t),  we  can  write  the  system  of  integral  equations  (i.e.,  rate  equations)  that  approximate  the 
stochastic  equations  (7)  via  the  SLLN.  The  rate  approximation  equations  are  given  by 


u"  (t)  =  u"  (0)  -  rf  (t)  -  rf  (t)  -  rf  (t)  +  (1  -  m)(rf  (t)  +  rf  (t)  +  rf  (t)) 

«  -u(O)  —  f  Hi u(s)ds  —  f  N /3u(s)c(s)ds  —  f  N/3(  1  —  7 )u(s)c(s)ds 
Jo  Jo  Jo 

+  /  (1  -  m)(Hiu(s)  +  H2(c(s)  +  j(s)))ds 

Jo 

cN(t)  =  CN(0)  -  7(()  +  r f(t)  +  r "(f)  -  r6"(i)  +  (t)  +  r2w(t)  +  r^(()) 

ps  c(0)  —  /  H2c(s)ds  +  j  N f3u(s)c(s)ds  +  /  iV/3(l  —  ry)u(s)j(s)ds 

Jo  Jo  Jo 

-  [  ac(s)d.s  +  f  m(Hiu(s)  +  H2(c(s)  +  j(s)))ds 


r{t)=r(0)-rH(t)  +  rZ{t) 

~j( 0)  -  [  H2j(s)ds+  f  ac(s)ds. 


(11) 


Upon  approximating  (uN (t),cN (t),jN (t))  in  the  left  side  by  (u(t) ,  c(t) ,  j (t))  and  differentiating 
the  above  equations  we  obtain  the  defining  deterministic  ordinary  differential  equations  for  (u,  c,j) 
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given  by 


=  - niu(t )  -  j3Nu(t)c(t)  -  /3N(  1  -  7 )u(t)j(t)  +  (1  -  m)(mu(t)  +  fi2(c(t)  +  j(t))) 

— ^  =  ~H2c{t)  +  /3Nu(t)c(t)  +  j3N(l  -  7 )u(t)j(t)  -  ac(t )  +  m(n\u(t)  +  //2(c(i)  +  j(t))) 
at 

=  ~^j(t)  +  ac(t),  (12) 

with  initial  conditions  u(0)  =  Uq/N,  c(0)  =  Cq/N ,  and  j(0)  =  Jo/ N. 

To  facilitate  comparison  with  the  MC  model,  we  let  U(t)  =  Nu(t),  C(t )  =  Nc(t),  and  J(t)  = 
Nj(t).  Then,  the  system  of  ordinary  differential  equations  which  provides  an  approximation  to 
averages  over  sample  paths  of  C(t),  J(t ))  is  described  by 

-  =  (1  —  fn)ln\U{t)  +  n2(C(t)  +  J(t))\  —  (3U(t)[C(t)  +  (1  —  ^)J(t)\  -  hi U(t) 

^  =  m[HiU(t)  +  H2(C(t)  +  +  l3U(t)[C(t )  +  (1  -  7) At)}  ~  («  +  H2)C(t) 

dJ^-  =  aC(t )  -  H'2J \t),  (13) 

with  initial  conditions  U(0)  =  Uq,  C(0)  =  Co,  and  J(0)  =  Jq. 

2.3  Simulation  results 

We  carried  out  simulations  to  compare  the  results  of  the  stochastic  and  deterministic  models. 
The  deterministic  system  was  numerically  solved  using  ode 45  in  Matlab.  Both  deterministic  and 
stochastic  results  are  generated  using  the  same  parameter  values  and  initial  conditions.  We  used 
a  stochastic  simulation  algorithm  proposed  by  Gillespie  [20]  that  is  standard  for  discrete  state 
continuous  time  MC  models.  The  algorithm  is  the  following: 

1.  Initialize  the  state  of  the  system; 

2.  For  a  given  state  of  the  system  calculate  the  transition  rates,  A,,  for  i  =1 where  n  is  the 
total  types  of  transitions; 

3.  Calculate  the  sum  of  all  transition  rates  A  =  Y/'i= i 

4.  Monte  Carlo  Step:  Simulate  the  time  until  the  next  transition,  r,  by  drawing  from  an 
exponential  distribution  with  mean  1/A; 

5.  Monte  Carlo  Step:  Simulate  the  transition  type  by  drawing  from  the  discrete  distribution 
with  P (transition  =  i )  =  \/\.  Generate  an  uniformly  distributed  random  number  r2.  For 
0  <  r2  <  Ai/A  transition  1  is  chosen,  for  Ai/A  <  r2  <  (Ai  +  X2)/\  transition  2  is  chosen,  and 
so  on; 

6.  Update  the  new  time  t  =  t  +  r  and  the  new  system  state; 

7.  Iterate  steps  2-6  until  t  >  tstop- 
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Figure  2:  Sample  of  5  stochastic  realizations  in  comparison  to  the  numerical  solution  of  the  deter¬ 
ministic  model;  N  =  37  patients,  tstop  =  500. 


Figure  2  depicts  a  simulation  of  the  stochastic  model  for  a  sample  of  5  stochastic  realizations 
(N  =  37,  tstop  =  500)  plotted  in  comparison  to  the  deterministic  numerical  solution  for  the  three 
compartments.  Note  that  the  stochastic  realizations  exhibit  very  large  differences.  However,  when 
carrying  out  the  simulations  for  larger  values  of  N,  the  variation  between  the  stochastic  realizations 
decreases  as  the  value  of  N  increases  and  are  closer  to  the  numerical  solution  of  the  deterministic 
model,  as  seen  in  Figure  3.  To  quantitatively  analyze  how  the  variability  of  the  stochastic  real¬ 
izations  decreases  as  N  increases,  we  calculated  the  coefficient  of  variation  (CV)  in  the  range  t  = 
[300,  400]  using  100  stochastic  realizations.  These  are  given  in  the  caption  for  Figure  3. 

3  VRE  Surveillance  Data 

The  motivating  surveillance  data  is  from  an  oncology  unit,  obtained  from  the  VRE  Infection  Control 
database  of  the  Department  of  Quality  Improvement  Support  Service  of  Yale-New  Haven  Hospital. 
Data  reports  on  the  number  of  VRE  cases  occurred  on  admission  (including  patients  transferred), 
the  patients’  length  of  stay,  the  daily  number  of  patients  in  isolation  due  to  VRE  colonization, 
the  compliance  of  swab  culture  administered  on  admission,  and  the  health  care  worker  contacts 
precautions  compliance.  Data  collection  occurred  during  the  period  of  January  2005  to  January 
2007  with  a  mean  number  of  in-patients  per  day  of  31  patients  (with  a  total  of  37  beds). 

Ward  protocol  required  rectal  swabbing  all  patients  on  admission,  and  once  a  week  (every 
Tuesday)  for  VRE  surveillance.  Compliance  was  not  100%,  as  the  mean  percentage  of  swab  cultures 
taken  on  admitted  patients  per  day  was  77%.  Swab-test  results  were  usually  returned  48  hours 
after  admission.  If  a  patient  tested  VRE  positive,  he/she  was  isolated.  The  isolation  procedure 
consisted  of  contact  precautions  by  the  use  of  gowns,  gloves,  and  the  location  of  a  patient  in  a 
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Figure  3:  Sample  of  5  stochastic  realizations  for  each  compartment  in  comparison  to  the  numerical 
solution  of  the  deterministic  model  for  N  =  137, 537,  937,  2037,  tstop  =  500.  The  coefficient  of 
variation  (CV)  for  [U,  C,  J ]  in  the  range  t  =  [300, 400]  using  a  sample  of  100  stochastic  realizations 
is:  (a)[0.064, 0.060, 0.027],  (b)[0.036, 0.027, 0.008],  (c)[0.031, 0.021, 0.006],  (d)[0.029, 0.014, 0.004] 
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Table  3:  Parameters  and  initial  conditions  values  (some  values  are  assumed  for  optimization  pur¬ 
poses) 


Initial  Conditions 

Oncology  Unit  (N=37) 

Source 

U(  0) 

29 

Assumed 

C(  0) 

4 

Assumed 

J(0) 

4 

data 

Parameters 

A 

l^iU(i)  +  /i2(C'(t)  +  <7(t)) 

- 

m 

0.04 

Assumed 

(3 

0.001 

Assumed 

7 

0.58 

data 

a 

0.29 

data 

hi 

0.16 

data 

h2 

0.08 

data 

single  room  or  in  a  room  with  another  patient  who  was  also  VRE  positive.  If  a  readmission  patient 
had  a  positive  VRE  culture  in  the  past,  he/she  did  not  get  the  rectal  swab  on  admission  but  was 
isolated  immediately.  The  isolation  report  was  performed  on  weekdays  (no  weekends  or  holidays). 
The  mean  number  of  isolated  VRE  colonized  patients  per  day  was  9.39  (std=2.90)  patients. 

3.1  Parameters  estimated  directly  form  the  surveillance  data 

Infection  control  measures  were  implemented  in  the  form  of  health  care  worker  hand- hygiene  before 
and  after  patients  contact  by  the  use  of  gloves  and  gowns,  and  washing  the  hands.  For  the4  present 
consideration  we  are  consider  the  health  care  worker  before  patient  contact  compliance  of  57.56% 
as  a  better  estimator  for  the  parameter  7.  In  the  oncology  unit  VRE  colonized  patients  had  a 
mean  length  of  stay  of  13.15  days  (std=18.28)  compared  with  6.27  (std=6.80)  for  the  uncolonized 
patients.  These  means  are  statistically  significantly  different  supporting  the  assumption  of  different 
discharge  rates.  Hence,  we  take  \f\i\  =  6.27  and  1///2  =  13.15  giving  =  0.16  and  n 2  =  0.08. 

In  an  attempt  to  estimate  the  fraction  m  of  patients  that  are  colonized  on  admission,  we  found 
inconsistencies  in  the  reported  prevalence  of  VRE  on  admission  (the  summaries  of  admitted  patients 
did  not  match  the  actual  data).  In  estimating  the  initial  conditions  (Uo,  Co,  Jo)  from  the  data 
reported  on  the  first  day  of  data  collection  (January  3,  2005),  only  the  number  of  VRE  colonized 
patients  in  isolation  were  reported.  Hence,  the  initial  conditions  for  Uo  and  Co  classes  cannot  be 
easily  estimated.  Another  parameter  that  is  of  interest  and  can  not  be  estimated  directly  from  the 
data  is  the  VRE  transmission  rate  (3.  As  a  result,  the  fraction  of  patients  that  are  colonized  on 
admission,  the  initial  conditions,  and  the  transmission  rate  will  be  estimated  using  inverse  problem 
methodology.  In  Table  3  we  present  the  estimated  and  assumed  values  of  the  parameters  and  initial 
conditions  taken  as  nominal  values  in  inverse  problem  calculations. 
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4  Inverse  Problem  Methodology 


We  outline  briefly  the  statistical  methodology  for  estimating  parameters  in  dynamical  systems  such 
as  (13)  using  observations  of  some  of  the  states.  More  details  can  be  found  in  [6,  17]. 

Let  Yj  for  j  =  1  be  longitudinal  data  observations  (which  are  random  variables)  corre¬ 

sponding  to  the  experimental  data  for  the  observation  process.  Since  in  general  Yj  is  not  assumed 
to  be  free  of  error  (i.e.,  error  in  the  data  collection  process),  Yj  will  not  be  exactly  f(tj,0o ),  the 
observed  part  of  the  true  trajectory  at  time  tj.  The  statistical  model  that  captures  the  variability 
is  assumed  given  by 


Yj  =  f(tj,  90)  +  Sj  for  j  =  1, ..,  n,  (14) 

in  the  case  of  absolute  error  in  the  measurements.  We  can  thus  envision  experimental  data  as 
generally  consisting  of  observations  from  a  “perfect”  model  plus  an  error  component,  where  0q 
corresponds  to  the  “true”  parameter  that  generates  the  observations  Yj  for  j  =  1,  We  assume 
that  the  £j’s  are  generated  from  a  generally  unknown  probability  distribution  P.  They  are  assumed 
to  satisfy  the  error  assumptions 

(EA)  The  random  variables  £j,j  =  l....,n,  are  independent  identically  distributed  with  mean 
zero  (i.e.,  E(£j)  =  0)  and  constant  finite  variance  (i.e.,  var(£j)  =  <7q  <  oo). 

The  observational  process  corresponding  to  the  mathematical  model  (13)  is  denoted  by 

f(tj,0o)  =  J(tj,0o).  (15) 

where  the  observation  function  f(tj,0 )  depends  on  the  parameters  0  in  a  nonlinear  fashion. 


4.1  Ordinary  least  squares  (OLS)  estimation 

If  the  error  distribution  is  unknown,  an  OLS  optimization  procedure  is  often  employed.  This  method 
can  be  viewed  as  minimizing  the  distance  between  the  data  and  the  model  where  all  observations 
are  treated  as  of  equal  importance.  The  OLS  method  defines  “best”  as  when  the  norm  square  of 
the  residuals  is  a  minimum 

n 

9ols  =  9ols  =  arg  minee©  ~  /(*j>  e)f-  (16) 

3  = 1 


This  corresponds  to  solving  for  0  in 


J2[yj-f(tj,0)]vf(tj,0)  =  o. 

3  = 1 


We  do  not  know  the  distribution  of  the  random  variable  0ols,  but  under  asymptotic  theory  [6,  17, 
27]  we  have  as  n  — >  oo  the  approximation 


0OLS  =  9 ols  ~  K(0o,  so)’ 


(17) 
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where  the  covariance  matrix  Eg  is  defined  by 


so=°o[nfio]  1 

with 

Q0  =  limn^oo-xn(0o)Tx"(6lo)- 

n 

Here  xn{@)  =  {Xjk}  is  the  sensitivity  matrix  given  by 

XjkiO)  =  -  j  =  l,-,n  and  k  =  l,...,p. 

The  error  variance  Oq  is  approximated  by 

i  n 

°OLS  =  — ~  ~  Qols)}2  (18) 

n  p  j=i 

as  the  bias  adjusted  estimate  for  erg,  where  6ols  is  the  realization  of  Ools  for  a  given  realization 
{yj}  of  {Yj}.  The  covariance  matrix  Eg  is  approximated  by 

^ols  =  vols[xT  (@ols)x{8ols)]  1-  (19) 

Therefore  in  practice  one  uses  the  approximation 

Ools  ~  Np{0 o,  Eg)  «  Mp{9ols ,  YqLS).  (20) 

Asymptotic  standard  errors  for  the  parameter  estimates  are  obtained  by  taking  square  roots  of 

the  diagonal  elements  of  S qLS,  i.e.,  SE(Ok)  =  \J (E QLS)kki  k  =  1  The  sensitivity  matrix 

can  be  calculated  by  solving  the  sensitivity  equations 

d_  dx  =  dg_  dx  dg_ 
dt  d9  dx  dO  dO  ’ 

where  in  our  example  (13),  written  as  x  =  g(x(t),0),  dg/dx  is  a  3  x  3  matrix  function  and  both 
dx/dO  and  dg/dO  are  3  x  p  matrix  functions. 

4.2  Generalized  least  squares  (GLS)  estimation 

If  the  error  distribution  is  unknown  and  we  suspect  that  relative  error  is  present  in  the  measurement, 
then  the  assumption  of  constant  variance  of  the  error  in  the  longitudinal  data  does  not  hold.  In 
such  cases,  a  generalized  least  square  (GLS)  optimization  procedure  should  be  employed.  For  this 
case  we  need  to  formulate  a  new  statistical  model  to  take  into  consideration  the  non-constant  error 
variability.  If  we  can  assume  that  the  size  of  the  error  depends  linearly  on  the  size  of  the  observed 
quantity,  the  statistical  model  (i.e,  relative  error  model)  is  given  by 

Yj  =  f{tj,0o)(l  +  £j)  for  j  =  1, ..,  n,  (22) 
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where  the  £j  satisfy  (EA).  It  follows  that  Yj  J\T(f(tj,9o),(jQf2(tj,9o)).  In  this  case,  GLS  can 
be  viewed  as  minimizing  the  distance  between  the  data  and  the  model  while  taking  into  account 
a  model  dependency  variance  in  the  observations.  The  GLS  method  defines  “best”  estimator  as 
9gls  obtained  from  solving 

n 

yj-2(tj,9GLs)[Yj  -  f(tj,9GLs)]Vf(tj,0GLS )  =0,  (23) 

3= 1 


with  the  corresponding  estimate  9gls  for  a  given  realization  {yj}-  From  asymptotic  theory  [6,  17] 
we  find 


where 

with 


Ogls  =  Oqls  ~  so) 


xiP) 


^[xT(0o)W(9o)x(9o)}-1 


r  df(tue)  df(t  1,0)  1 

901  "  '  dBp 


df(t„,0)  _  _  _  df{tn,0) 

90i  90p 


(24) 


and  IT  x(0)  =  diag(f2(ti,9), (f2(tn,9)).  Using  the  estimates  we  have  the  covariance  matrix 
approximation 

so  ~  Yqls  =  ^gls[xT(0gls)W (i 0gls)x{9gls)]~ 1  (25) 

and  the  error  variance  approximation 


aGLS  ~ 


n 

—  E 

—  T)  * 


n  P  f2(tj,0GLs) 


[Vj  ~  f(tj,0GLs)Y 


Therefore  in  practice  we  use  the  approximation 

9gls  ~  NP{0o,  £q)  ~  Np{9Gls,  Y'gls)- 


(26) 


(27) 


We  can  also  calculate  the  asymptotic  standard  errors  for  9qls  by  taking  the  square  roots  of  the 
diagonal  elements  of  the  covariance  matrix  YqLS.  Again  the  sensitivity  matrix  xi^GLs)  =  {Xjk} 
can  be  calculated  using  the  sensitivity  equations  in  (21). 

Typically,  one  does  not  attempt  to  solve  (23)  directly,  but  rather  the  estimate  9gls  for  a  given 
realization  {yj}  can  be  solved  for  iteratively  using  the  algorithm: 

1.  Set  k  =  0.  Estimate  the  initial  9q^s  by  using  the  OLS  estimate  with  yj  in  place  of  Yj  in  (16); 

2.  Form  the  weights  iUj  =  f~2(tj,  9glS)', 

3.  Find  9(yp  by  minimizing 


Jk{0cLs)  =  ywjlvj  -  f(tj,0GLs) I2; 

3= 1 


(28) 
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4.  Set  k  =  k  +  1  and  return  to  2.  Terminate  the  process  when  two  successive  estimates  for  Oqls 
are  “close”  to  one  another. 


4.3  Subset  selection  algorithm 

It  is  typical  that  in  systems  such  as  (13)  some  of  the  parameters  (components  of  6)  are  more 
readily  estimated  than  others.  The  ability  to  reliably  estimate  a  parameter  is  directly  related  to  the 
sensitivity  of  the  model  output  to  a  parameter.  In  order  to  identify  the  subset  of  parameters  that  has 
a  high  sensitivity  to  the  mathematical  model,  we  use  the  identifiability  analysis  recently  developed 
in  [14].  The  parameter  selection  or  parameter  identifiability  algorithm  consists  of  considering  two 
criteria: 

1.  Select  the  combinations  of  parameter  vectors  that  have  a  full  rank  sensitivity  matrix  Xn(0). 
The  degree  of  sensitivity  for  the  matrix  is  measured  in  the  form  of  its  condition  number 
ft(xn(0))  defined  below  in  (36); 

2.  For  each  parameter  vector  selected  in  the  first  criteria,  estimate  its  degree  of  uncertainty.  Its 
degree  of  uncertainty  is  measured  in  the  form  of  the  parameter  selection  score  v{9)  defined 
by  (37). 

The  motivation  behind  the  first  criterion  is  as  follows.  If  9q  is  the  true  parameter,  then  A 9  = 
9  —  6*o  denotes  a  local  perturbation  from  9q.  This  gives  rise  to  a  local  perturbation  A y(t)  = 
y(t,  9)  —  y(t,  9q)  in  the  output  model.  Then  by  a  first  order  Taylor  approximation  we  obtain  the 
approximate  relationship 

Ay  «  XA 9.  (29) 

A  parameter  vector  is  identifiable  (locally)  if  the  above  equation  can  be  solved  uniquely  for  A 9. 
This  is  the  case  if  rank(x )  =  Ik  or  equivalently,  if  and  only  if  the  Fisher  information  matrix, 
F  =  XT(#)X(0)  is  nonsingular  or 

det(xTX)  +  0.  (30) 

The  Fisher  information  matrix  measures  the  amount  of  information  that  an  observation  process 
carries  about  an  unknown  parameter  9.  If  near-singularity  of  F  is  present  then  the  approximation  of 
the  covariance  matrix  and  consequently  the  calculation  of  standard  errors  and  confidence  intervals 
for  the  corresponding  estimated  parameters  are  affected. 

If  one  focuses  on  properties  of  the  sensitivity  matrix  X(0)  rather  than  the  Fisher  information 
matrix,  a  singular  value  decomposition  (SVD)  of  the  sensitivity  matrix  plays  a  crucial  role  in 
uncertainty  quantification.  The  SVD  of  the  sensitivity  matrix  is  denoted  by 


x(0)  =  u 


A 

0 


(31) 


where  U  =  \U\  U2]  and  V  are  n  x  n  and  p  x  p  orthogonal  matrixes,  with  U±  containing  the  first 
p  columns  of  U  and  U2  containing  the  last  n  —  p  columns.  A  is  a  p  x  p  diagonal  matrix  defined  as 
A  =  diag(s  1, ...,  sp )  with  s\  >  S2  >  ...  >  sp  >  0,  and  0  denotes  an  (n  —  p)  x  p  matrix  of  zeros. 

Suppose  that  f(t,  9)  is  well  approximated  for  all  t  =  tj  by  its  linear  Taylor  expansion  around  9q 
as 

f(t,e)nf(t,6o)  +  %(t,0o)(6-eo).  (32) 
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Then  letting  f{9)  =  (f{ti,0),...,f(tn,0))T,Y  =  (Yi,...,Yn)T  and  £  =  (£1, Sn)T,  we  have  from 
(14) 

Y-m  =  -X(e  o)(9-90)+£.  (33) 

We  can  then  define  the  estimator  6ols  that  minimizes  \Y  —  f(9)\2  and  using  the  invariance  property 
of  the  Euclidean  norm  (i.e. ,  |ie|2  =  wTw  =  w1  Iw  =  wTUUTw  =  \UTw\2)  we  have  that 

\Y-f(9)\2  =  \-x(90)(9-90)+£\2 

=  UT(-U  q  vT(9-90)  +  £^j 

=  -  0  Vt(6-90)+  £  ■  (34) 

Solving  |  —  A Vt{9  —  9q)  +  Uf£ |2  =  0  for  6  we  have 

6-90  =  (A  Vt)~1U?£. 

This  implies 

Ools  =  90  +  VK-xUj£ 

p  i 

=  90 +  ^2—Viuf£.  (35) 

i=l  Si 

Note  that  if  st  — >  0,  the  estimator  is  particular  sensitive  to  £. 

If  x(9)  £  Mnxp  is  a  full  rank  sensitivity  matrix  (i.e.,  rank(x(9 ))  =  p)  its  condition  number  k  is 
defined  as  the  ratio  of  the  largest  to  smallest  singular  value  given  by 

4xm  =  (36) 

Sp 

which  provides  a  degree  of  singularity  due  to  perturbations  and  hence  a  criteria  for  parameter 
identifiability.  If  the  columns  of  x(0)  are  nearly  dependent  then  (36)  is  large. 

Motivation  of  the  second  criteria  is  the  uncertainty  in  the  parameters  of  a  particular  subset 
combination  that  can  be  quantified  using  the  standard  errors  SE{9).  In  order  to  compare  the 
degree  of  variation  from  one  parameter  to  another,  the  coefficient  of  variation  CV  =  SE{9)/9  £ 
MP  is  used.  The  CV  allows  one  to  compare  the  parameters  even  if  the  parameter  estimates  are 
substantially  different  in  units  and  scales.  Hence,  a  second  criteria  can  be  established  by  considering 
the  parameter  selection  score 

v(9)  =  \CV{9)\.  (37) 

In  (37)  a  value  near  zero  indicates  lower  uncertainty  possibilities  in  the  estimation  while  large  values 
suggest  a  possibility  of  a  wide  uncertainty  in  at  least  some  of  the  estimates. 

In  general,  the  algorithm  that  searches  all  possible  parameter  combinations  and  selects  the  ones 
satisfying  criteria  1.  and  2.  is  the  following: 


15 


1.  Combinatorial  search.  For  a  fixed  p.  1  <  p  <  K  (where  K  is  total  number  of  parame¬ 
ters  and  initial  conditions  that  are  candidates  for  estimation- for  our  problem  here  K  =  8), 
calculate  the  set 

S.p  =  {6  =  (qi,  ...,qp)  G  Rp\qk  G  Qk%  Qk  /  qi  for  all  k,l  =  1,  ...,p} 

where  Q#  =  {a,  7,  //1,  //2,  Jo>  Co,  m,  /?}  and  the  set  collects  all  the  parameter  vectors 
obtained  from  the  combinatorial  search; 

2.  Full  rank  test.  Calculate  the  set  of  feasible  parameters  &p  as 

Qp  =  {0\6  G  Sp  C  Mp,  rank(x(Q)  =  p)}-  Calculate  the  condition  number  defined  by 

«(x(0))  =  7-; 

Sp 

3.  Standard  error  test.  For  every  6  G  0p,  calculate  a  vector  of  coefficients  of  variation 
CV(9)  G  W  by 

.ot/-  _  V 
CVi  ~  %  ’ 

for  i  =  1,  ...,p  and  =  (Tq[x{9)T x(^)]-1  G  Mpxp.  Calculate  the  parameter  selection  score 

as  v(0)  =  \CV(9)\. 

5  Inverse  Problem  Results 

5.1  Optimization  algorithm  testing  with  synthetic  data 

Before  illustrating  with  the  VRE  surveillance  data,  we  test  and  illustrate  use  of  the  optimization 
algorithm  to  investigate  the  convergence  of  the  parameters  estimates  6  to  the  known  values  9q.  In 
order  to  do  this,  we  construct  a  synthetic  dataset  { yj }  for  j  =  1,  ...,n,  by  adding  variability  to  the 
corresponding  model  solution  component  f(tj,9o )  =  J(tj,9o)  in  (13).  The  statistical  model  that 
captures  the  variability  is  taken  as 


yj  =  f(tj,90)  +  azj  (38) 

where  Zj  is  a  realization  from  a  standard  normal  variable  (i.e.,  Zj  ~  jV(0, 1))  and  a  is  the  constant 
variability.  The  magnitude  of  a  determines  how  much  noise  is  added.  A  low  value  indicates  that 
the  data  points  tend  to  be  very  close  to  the  same  value  (the  mean),  while  high  values  indicates 
that  the  data  are  “spread  out”  over  a  large  range  of  values.  Therefore,  we  can  expect  that  95%  of 
the  time,  numbers  generated  from  this  distribution  will  fall  in  the  interval  [—1.96<r,  1.96a].  To  this 
end,  we  consider  the  standard  error  as  one  indication  of  the  ability  of  the  algorithm  to  estimate 
the  parameters  using  the  synthetic  data  set. 

The  OLS  and  GLS  optimization  were  solved  with  MATLAB  routine  Isqnonlin  for  n=500. 
Parameter  upper  bounds  are  taken  as 


(a,7>m,/k2,</o,Cb,ra,/3) 


(0.5, 1,1, 1,1V,  N,  1,1) 


and  lower  bounds  are  set  to  zero.  Note  that  the  upper  bound  for  a  is  0.5  because  the  method 
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for  VRE  detection  is  assumed  to  take  at  least  2  days.  The  model  solutions  f(tj,8o )  =  J(tj,8o ) 
are  generated  with  initial  conditions  and  parameter  values  do  for  the  oncology  unit  as  described  in 
Table  3  (which  are  assumed  to  be  the  true  values).  By  introducing  variability  levels  such  as  a  =  0, 
<7  =  0.01,  a  =  0.05,  and  a  =  0.1  in  the  model  solutions  the  reliability  of  the  algorithm  and  hence 
that  of  estimates  are  explored.  Note  that  even  though  we  are  adding  constant  variability  to  the 
synthetic  data,  the  GLS  optimization  algorithm  is  tested  with  this  data.  This  is  because  we  wish 
to  investigate  how  the  noise  affects  the  standard  deviation  and  not  how  meaningful  they  are. 

In  Tables  4,  5,  and  6  we  summarize  the  results  for  the  inverse  problems  for  9  =  (Jo,Cq,  P), 

8  =  ( Jo,  Co,  m,  P),  and  9  =  (a,  Jo,Co,m,  P)  using  an  OLS  and  a  GLS  optimization  formulation. 
Results  indicates  that  both  algorithms  appear  to  be  reliable  for  the  estimation  of  the  parameters 
since  the  estimated  values  are  close  to  their  true  values.  Note  that  as  a  increases  the  corresponding 
standard  errors  increase.  This  indicates  that  the  reliability  of  both  algorithms  in  estimating  the 
parameters  may  depend  on  the  observational  error  in  the  data.  Similar  results  were  obtained  for 
the  other  types  of  inverse  problem  formulations. 

5.2  Subset  selection  results  using  the  oncology  unit  surveillance  data 

To  carry  out  the  subset  selection  algorithm  with  the  oncology  unit  surveillance  data  we  assumed 
nominal  parameter  values  described  in  Table  3.  Since  we  are  interested  in  estimating  the  initial 
conditions,  transmission  rate,  and  the  fraction  of  patients  that  are  already  colonized  on  admission, 
when  p  =  4  the  only  parameter  combination  considered  is  that  of  9  =  (Jo,  Co,  m,  P).  When 
p  =  1,2,3  the  only  parameters  considered  are  9  =  (/?),  9  =  (m,  /?) ,  and  9  =  (Jo,  Co,  j3). 

In  Table  7  we  present  the  resulting  selection  score  v{9)  and  condition  number  k(x(9))  f°r  each 
subset  of  parameters  when  p  =  1 , . . . ,  8.  Values  that  fall  in  the  smallest  selection  score  with  the 
relative  small  condition  number  are  considered  the  most  feasible  subset  of  parameters.  Results  in¬ 
dicate  that  the  subsets  of  parameters  9  =  (Jo,  Co,m,  j3)  have  small  condition  numbers  and  relatively 
small  selection  scores  indicating  that  these  subsets  might  provide  low  uncertainty  in  the  parameter 
estimates.  In  Table  8  we  summarize  the  results  of  4  inverse  problems  corresponding  to  the  subsets 
with  the  lowest  selection  scores  and  small  condition  numbers.  These  subsets  of  parameters  are: 

9  =  (7,  J0,  C0,m,  P) 

9  =  (J0,Co,m,P) 

8  =  (J0,C0,P ) 

9  =  (m,P). 

We  analyze  the  results  using  the  coefficient  of  variation  (CV)  by  considering  the  effect  that 
the  inclusion  or  exclusion  of  parameters  has  on  the  vector  9  =  (Jo,  Co,  m,  ft).  In  this  subset,  the 
standard  errors  for  Jo  is  about  0.4%  of  the  estimate,  for  Co  it  is  about  0.8%  of  the  estimate,  for 
m  it  is  about  1.6%  of  the  estimate,  and  for  p  it  is  0.3%  of  the  estimate.  When  including  7  (i.e., 

9  =  (7,  Jo,  Co,  m,  /?)),  the  CV  increases  for  almost  all  parameters.  On  the  other  hand,  when  m 
is  dropped  or  when  the  initial  conditions  are  dropped,  there  is  a  reduction  in  the  CV.  Since  this 
reduction  is  not  significant,  we  can  conclude  that  the  subset  9  =  ( Jo,  Co,  m,  ft)  is  a  good  choice  to 
be  estimated  from  the  oncology  surveillance  data  since  it  provides  estimates  with  low  uncertainty. 

Comparison  of  residual  plots  (for  details  on  the  use  of  residual  plots  in  such  problems,  see  [6]) 
for  all  subsets  of  parameters  combinations  suggested  that  the  assumptions  of  the  relative  error 
statistical  model  (22)  corresponding  to  the  GLS  procedure  are  most  suitable.  In  particular,  the 
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Table  4:  OLS  and  GLS  optimization  algorithm  testing  for  6  =  (Jo.  Co,  (3)  using  synthetic  data.  The 
model  was  fit  to  the  synthetic  data  with  levels  of  noise:  a  =  0,0.01,0.05,  and  0.1.  Subscripts  in  9a 
denote  the  level  of  noise  in  the  synthetic  data. 


Jq 

P 

e°LS 

4.000e+00 

4.000e+00 

1.000e-03 

SE(9°ls) 

2.301e-13 

3.097e-13 

1.691e-17 

& 

4.007e+00 

3.998e+00 

1.006e-03 

SE(9  OLS) 

2.162e-03 

2.907e-03 

1.584e-07 

& 

4.022e+00 

3.995e+00 

1.032e-03 

SE0  0°o¥) 

1.077e-02 

1.444e-02 

7.793e-07 

4.074e+00 

3.954e+00 

1.0636-03 

SE(0°js) 

2.222e-02 

2.971e-02 

1.5856-06 

e$LS 

4.000e+00 

4.000e+00 

1.000e-03 

SE(9qLS) 

3.956e-15 

5.346e-15 

4.358e-19 

aGLS 

f70.01 

4.002e+00 

4.000e+00 

1.0066-03 

se(& 

4.150e-05 

5.604e-05 

4.553e-09 

nGLS 

f7o.o5 

4.040e+00 

3.973e+00 

1.032e-03 

se(&) 

2.1126-04 

2.847e-04 

2.290e-08 

e^s 

4.016e+00 

4.015e+00 

1.0676-03 

SE(9qiS) 

4.119e-04 

5.523e-04 

4.343e-08 
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Table  5:  OLS  and  GLS  optimization  algorithm  testing  for  0  =  (Jo,Co,m,  j3)  using  synthetic  data. 
The  model  was  fit  to  the  synthetic  data  with  levels  of  noise:  a  =  0,0.01,0.05,  and  0.1.  Subscripts 
in  9a  denote  the  level  of  noise  in  the  synthetic  data. 


Jo 

Go 

m 

P 

True  9 

4 

4 

0.04 

0.001 

Initial  9 

3 

5 

0.05 

0.002 

9°LS 

4.000e+00 

4.000e+00 

4.000e-02 

1.000e-03 

SE(9qLS) 

8.620e-12 

1.557e-ll 

1.611e-13 

1.352e-14 

& 

4.004e+00 

4.008e+00 

4.013e-02 

9.955e-04 

SE(9  °tf) 

2.372e-03 

4.287e-03 

4.457e-05 

3.735e-06 

& 

4.029e+00 

3.992e+00 

4.032e-02 

1.004e-03 

SE(9°  %£) 

1.102e-02 

1.990e-02 

2.091e-04 

1.741e-05 

4.074e+00 

3.945e+00 

3.987e-02 

1.074e-03 

SE(9qiS) 

2.271e-02 

4.067e-02 

4.273e-04 

3.529e-05 

9$LS 

4.000e+00 

4.000e+00 

4.000e-02 

1.000e-03 

SE(9qLS) 

1.496e-13 

2.696e-13 

3.145e-15 

2.626e-16 

& 

4.003e+00 

4.001e+00 

4.003e-02 

1.004e-03 

SE(&) 

4.636e-05 

8.350e-05 

9.762e-07 

8.137e-08 

& 

4.009e+00 

4.013e+00 

4.022e-02 

1.014e-03 

SEffiM) 

2.343e-04 

4.214e-04 

4.971e-06 

4.118e-07 

0^s 

4.050e+00 

4.011e+00 

4.046e-02 

1.025e-03 

SE(9qiS) 

4.434e-04 

7.976e-04 

9.545e-06 

7.845e-07 
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Table  6:  OLS  and  GLS  optimization  algorithm  testing  for  6  =  (a,  Jo,  Co,  m,  (5)  using  synthetic  data. 
The  model  was  fit  to  the  synthetic  data  with  levels  of  noise:  a  =  0,0.01,0.05,  and  0.1.  Subscripts 
in  6a  denote  the  level  of  noise  in  the  synthetic  data. 


a 

Jq 

m 

P 

9°LS 

2.890e-01 

4.003e+00 

4.136e+00 

4.451e-02 

1.856e-03 

se{9°ls ) 

2.145e-04 

5.126e-04 

1.674e-03 

2.009e-05 

1.220e-06 

& 

2.895e-01 

4.010e+00 

4.120e+00 

4.454e-02 

1.872e-03 

SE{9  0%s) 

1.094e-03 

2.620e-03 

8.517e-03 

1.025e-04 

6.264e-06 

& 

2.811e-01 

4.023e+00 

4.269e+00 

5.080e-02 

1.670e-03 

SE(& 

5.755e-03 

1.337e-02 

4.696e-02 

6.381e-04 

3.168e-05 

o^s 

2.899e-01 

4.051e+00 

4.109e+00 

4.541e-02 

1.880e-03 

SE(6°ts) 

1.071e-02 

2.572e-02 

8.329e-02 

1.033e-03 

6.263e-05 

e$LS 

2.901e-01 

4.000e+00 

4.095e+00 

4.325e-02 

1.538e-03 

SE(9qLS) 

3.606e-06 

8.920e-06 

2.854e-05 

3.686e-07 

2.277e-08 

nGLS 

u0.01 

2.894e-01 

4.009e+00 

4.130e+00 

4.300e-02 

1.869e-03 

SE(9  0GoY) 

2.282e-05 

5.581e-05 

1.836e-04 

2.373e-06 

1.525e-07 

nGLS 

^o.os 

2.787e-01 

4.030e+00 

4.348e+00 

2.360e-02 

1.860e-03 

SEffiM) 

1.068e-04 

2.581e-04 

8.901e-04 

5.955e-06 

5.713e-07 

o$ks 

2.900e-01 

4.035e+00 

4.189e+00 

4.739e-02 

1.882e-03 

SE(9qiS) 

2.147e-04 

5.191e-04 

1.740e-03 

2.506e-05 

1.502e-06 
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Table  7:  Subset  parameter  selected  as  a  result  of  the  selection  algorithm  for  p  =  1, 8  using  the 
oncology  unit  surveillance  data  with  nominal  parameter  values  described  in  Table  3  using  the  GLS 
optimization. 


Parameter  vector  q 

Selection  score  u(g) 

Condition  number  n(\(q)) 

(0) 

1.9756-05 

1.000e+00 

(to,  p) 

2.358e-03 

8.070e+02 

(Jo,  Co,  ft) 

7.134e-03 

8.236e+04 

(Jo,  Co,  in,  (3) 

1.815e-02 

9.946e+04 

(7  ,Jo,Co,m,P) 

1.539e-01 

2.253e+05 

(a,  Jo,  Co,  to,  p) 

1.597e-01 

1.063e+06 

(Ml,  do,  Co,  TO,  P) 

1.715e+01 

1.308e+08 

(M2,  do,  Co,  TO,  p) 

6.123e+03 

3.695e+05 

(a,  n\,  Jo,  Co,  to,  /3) 

6.225e+00 

5.522e+06 

(7,  Mi,  do,  Co,m,p) 

1.741e+01 

1.127e+08 

(a,  a<2,  do,  Co,  to,  /3) 

6.315e+01 

2.453e+05 

(7,M2  ,do,  Co,m,p) 

8.472e+02 

7.112e+05 

(a,  7,  Jo,  Co,  to,  /3) 

2.297e+03 

2.852e+06 

(AH,  M2,  do,  Co,m,P) 

7.475e+04 

2.091e+05 

(a,  7,  /Lti ,  Jo,  Co,  to,  /3) 

8.413e+02 

2.074e+09 

(a,H!,H2,Jo,  Co,m,p) 

1.929e+03 

3.760e+05 

(a,  7,  a«2  ,  Jo,Co,m,  p) 

3.447e+04 

4.305e+06 

(7i  Mi)  M2>  do,  Co,  to,  d) 

4.589e+04 

4.311e+07 

(a,  7,  Mi)  AH,  Jo,  Co,  to, /3) 

1.469e+04 

1.967e+09 
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Table  8:  Results  of  4  inverse  formulations  solved  with  nominal  values  in  Table  3  via  GLS  optimiza¬ 
tion  for  the  oncology  unit  surveillance  data. 


7 

Jo 

G0 

m 

P 

6 

6.392e-01 

4.004e+00 

1.092e+00 

5.277e-02 

4.770e-03 

SE{9) 

2.680e-02 

1.811e-02 

4.985e-02 

6.007e-03 

3.955e-04 

CV(0) 

4.192e-02 

4.524e-03 

4.567e-02 

1.139e-01 

8.291e-02 

e 

- 

3.706e+00 

1.966e+00 

3.608e-02 

4.865e-03 

se{6) 

- 

1.499e-02 

1.560e-02 

5.616e-04 

1.675e-05 

CV0) 

- 

4.044e-03 

7.934e-03 

1.556e-02 

3.443e-03 

e 

- 

3.706e+00 

1.966e+00 

- 

4.865e-03 

SE(9) 

- 

1.419e-02 

1.184e-02 

- 

9.945e-08 

CV(0) 

- 

3.829e-03 

6.020e-03 

- 

2.044e-05 

9 

- 

- 

- 

4.070e-02 

4.725e-03 

SE{9) 

- 

- 

- 

9.290e-05 

2.802e-06 

CV(0) 

- 

- 

- 

2.282e-03 

5.931e-04 

residual  analysis  for  estimating  6  =  ( Jo,  Co,  m,  (3)  using  OLS  is  presented  in  Figure  4.  The  OLS 
residual  plots  (a)  and  (b)  in  Figure  4  reveal  a  fan  shaped  error  structure  which  indicates  the 
nonconstant  variance  assumption  is  suspect.  When  GLS  optimization  is  used  instead,  the  residual 
plots  (c)  and  (d)  in  Figure  4  (note  the  difference  in  scales  on  the  vertical  axes)  reveals  a  more 
random  error  structure,  suggesting  that  the  GLS  procedure  was  correctly  used.  Finally,  a  best  fit 
of  the  model  solution  to  the  oncology  surveillance  data  is  plotted  in  Figure  5.  We  note  that  this  is 
not  a  particularly  encouraging  fit  of  model  to  data,  suggesting  perhaps  unaccounted  for  modeling 
error  which  is  addressed  in  [25]  and  a  forthcoming  manuscript. 
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(a)  OLS:  Residuals  vs  Time 
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(c)  GLS:  Residuals/Model  vs  Time 


(b)  OLS:  Residuals  vs  Model 
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* 
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(d)  GLS:  Residuals/Model  vs  Model 


Figure  4:  Residual  analysis  for 
oncology  unit  surveillance  data. 


the  OLS  and  GLS  optimization  for  6  =  (Jo,Co,m,  f3)  using 
Note  the  difference  in  scales  of  axis  in  (a),(b)  versus  (c),(d). 
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*  *  *  * 
vb  vjb  vlb 

4v  4b  4b 

^  ^ 

vlb  vtiMbvb  vb  vb  vb  vb  vb  vl^.Xbvb  vb  vbbvb  vjifrb  vltob  vlbvb 

4b  4IMM'  4^  yF^F  'mtWS  4^  xF  ^HN'i'  'ms  ^N’F 

vbvbbx  vb  vbvb  vlxHUxlxvbUx  Ub  vb  Jx  v^bx  -Ub  vb  ^Z.^A_  vb  vbx  vlx  vibvlb  vb  vb  'JUI^ib  _ 

xFxbTfv  yF  ~FF  'I'HIM'W  4b  4v4v  4Pfv"4'4'  4v  4v"xF  4v4b  4v  ~F^F~7F  7F  4VMb 

vbvbvbvU/v'U^b  vbvb  vb  vb  \b  Mx  vlxlxb  *Mr  vlxlb  vMbvb  vMx  vldxUyb  vb  \b  vibvlxb  vb  vl^Mx  vb  vb  vbvb  vlAlxvb 

xMvxFflMMv  -T-T'  -Tv  7F  4v  -Tfv  4*FF  4F  /MS  xflNxb  ^TVV 'TiMVV'  "7F  -T*  'ffv-lxN  A-  -TbWv  TF  "^v  4FF  xNMv 
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Figure  5:  Best  fit  model  solutions  to  oncology  unit  surveillance  data  via  GLS  optimization, 
(Jo,  Co,  fa  J)  =  (4,2,0.04,0.0049). 


6  Concluding  Remarks 

Over  the  past  decade  efforts  to  connect  models  to  data  in  the  context  of  disease  dynamics  have 
grown  dramatically  albeit  most  of  the  efforts  have  been  carried  out  in  the  context  of  deterministic 
epidemic  models  [13].  However,  not  only  is  it  the  case  that  the  use  of  stochastic  Markov  Chain 
models  is  often  most  appropriate,  but  also  the  use  of  stochastic  processes  in  epidemiology  has  had 
a  long  and  distinguished  history  going  back  to  1766  [18]. 

The  introduction  of  a  methodology  for  parameter  estimation  within  the  context  of  a  typical 
MC  stochastic  model  through  the  use  of  a  limit  theory  due  to  Kurtz  provides  a  simple  rigorous 
approximation  approach  for  solution  to  an  inverse  problem  of  interest  to  epidemiologists.  We  have 
illustrated  this  approach  with  an  example  of  nosocomial  infections  in  hospital  occupancy  units. 
Since  the  number  of  beds  in  a  typical  hospital  unit  is  small  it  is  natural  to  consider  an  integer 
valued  stochastic  model.  Estimating  epidemiological  parameters  in  such  a  stochastic  model  can  be 
a  difficult  task,  particularly  when  the  data  is  quite  limited.  The  alternative  approach  involving  the 
estimation  of  parameters  from  a  corresponding  deterministic  approximation  to  the  MC  stochastic 
model,  based  on  large  sample  size  averages  over  sample  paths,  provides  a  reasonable  first  step. 
Once  a  deterministic  approximation  is  obtained,  one  can  readily  apply  standard  as  well  as  recently 
developed  parameter  estimation  methods  for  deterministic  systems  which  provide  not  only  the 
parameter  estimates  but  also  corresponding  measures  of  uncertainty  for  the  estimates. 


24 


7  Acknowledgments 


This  research  was  supported  in  part  by  the  Alfred  P.  Sloan  Foundation  Graduate  Scholarship 
(sponsored  by  the  Sloan  Foundation  and  NACME),  the  Louis  Stokes  Alliance  for  Minority  Par¬ 
ticipation  (LSAMP)  Fellowship  (sponsored  by  NSF/WAESO),  and  the  More  Graduate  Education 
at  Mountain  State  Alliance  (MGE@MSA)  Scholarship.  It  was  also  supported  in  part  by  the  U.S. 
Air  Force  Office  of  Scientific  Research  under  grant  AFOSR-FA9550-09- 1-0226  and  in  part  by  the 
National  Institute  of  Allergy  and  Infectious  Disease  under  grant  NIAID  9R01AI071915-05.  The 
authors  are  grateful  to  Dr.  Carlos  Torres-Viera  for  providing  the  sample  data  used  to  illustrate  the 
methodology. 

References 

[1]  L.  Allen,  An  Introduction  to  Stochastic  Processes  with  Applications  to  Biology ,  Pearson 
Education-Prentice  Hall,  New  Jersey,  2003. 

[2]  D.  J  .  Austin,  M.  J  .  M.  Bonten,  R.  A.  Weinstein,  S.  Slaughter,  and  R.  M.  Anderson, 
Vancomycin-resistant  enterococci  in  intensive-care  hospital  settings:  transmission  dynamics, 
persistence,  and  the  impact  of  infection  control  programs,  Proc.  Natl.  Acad.  Sci.  USA,  96 
(1999),  6908-6913. 

[3]  D.  J  .  Austin,  and  R.  M.  Anderson,  Transmission  dynamic  of  epidemic  methicillin-resistant 
staphylococcus  aureus  and  vancomycin-resistant  enterococci  in  England  and  Wales,  The  Jour¬ 
nal  of  Infectious  Diseases,  179  (1999),  883-891. 

[4]  D.  J  .  Austin,  M.  Kakehashi,  and  R.  M.  Anderson,  The  transmission  dynamic  of  antibiotic- 
resistant  bacteria:  the  relationship  between  resistance  in  commensal  organisms  and  antibiotic 
consumption,  Proc.  Biol.  Sci.,  264  (1997),  1629-1638. 

[5]  P.  Bai,  H.  T.  Banks,  S.  Dediu,  A.  Y.  Govan,  M.  Last,  A.  Lloyd,  H.  K.  Nguyen,  M.  S.  Olufsen,  G. 
Rempala,  and  B.  D.  Slenning,  Stochastic  and  deterministic  models  for  agricultural  production 
networks,  Mathematical  Biosciences  and  Engineering,  4  (2007),  373-402. 

[6]  H.  T.  Banks,  M.  Davidian,  J.R.  Samuels  Jr.,  and  K.  L.  Sutton,  An  inverse  problem  statisti¬ 
cal  methodology  summary,  Center  for  Research  in  Scientic  Computatione  Technical  Report, 
CRSC-TR08-01,  2008,  North  Carolina  State  University;  in  Mathematical  Statistical  Estimation 
Approaches  in  Epidemiology,  G.  Chowell  et  al.  (eds),  Springer,  New  York,  2009,  249-302. 

[7]  M.  J.  M.  Bonten,  R.  Willems,  and  R.  A.  Weinstein,  Vancomycin-Resistant  Enterococci:  why 
are  they  here,  and  where  do  they  come  from?,  The  Lancet  Infectious  Diseases,  1(5)  (2001), 
314-325. 

[8]  F.  Brauer  and  C.  Castillo-Chavez,  Mathematical  Models  in  Population  Biology  and  Epidemi¬ 
ology,  Texts  in  Applied  Mathematics,  Volume  40,  Springer- Verlag,  New  York,  2001. 

[9]  F.  Brauer,  P.  van  den  Driessche,  J.  Wu  and  L.  Allen,  Mathematical  Epidemiology,  Springer- 
Verlag,  New  York,  2008. 


25 


[10]  K.  E.  Byers,  A.  M.  Anglim,  C.  J.  Anneski,  and  B.  M.  Farr,  Duration  of  colonization  with 
vancomycin-resistant  enterococcus,  Infection  Control  and  Hospital  Epidemiology,  27  (2002), 
271-278. 

[11]  C.  Castillo-Chavez,  K.  C.  Chow,  and  X.  Wang,  A  mathematical  model  of  nosocomial  infection 
and  antibiotic  resistance:  evaluating  the  efficacy  of  antimicrobial  cycling  programs  and  patient 
isolation  on  dual  resistance,  Mathematical  and  Theoretical  Biology  Institute  Technical  Reports, 
MTBI-04-05M,  Arizonia  State  University,  2007. 

[12]  G.  Chowell,  P.  W.  Fenimore,  M.  A.  Castillo-Garsow,  and  C.  Castillo-Chavez,  SARS  outbreaks 
in  Ontario,  Hong  Kong  and  Singapore:  the  role  of  diagnosis  and  isolation  as  a  control  mecha¬ 
nism,  J.  Theor.  Biol. ,  224  (2003),  1-8. 

[13]  G.  Chowell,  J.M.  Hyman,  L.M.A.  Bettencourt,  and  C.  Castillo-Chavez  (Eds),  Mathematical 
and  Statistical  Estimation  Approaches  in  Epidemiology ,  Springer,  New  York,  2009. 

[14]  A.  Cintron- Arias,  H.T.  Banks,  A.  Capaldi,  and  A.L.  Lloyd,  A  sensitivity  matrix  based  method¬ 
ology  for  inverse  problem  formulation,  J.  of  Inverse  and  Ill-posed  Problems,  15  (2009),  545-564. 

[15]  B.  S.  Cooper,  G.  F.  Medley,  and  G.  M.  Scott,  Preliminary  analysis  of  the  transmission  dynamics 
of  nosocomial  infections:  stochastic  and  management  effects,  J.  of  Hospital  Infection,  43 
(1999),  131-147. 

[16]  E.  M.  C.  D’  Agata  ,  M.  A.  Horn,  and  G.  F.  Webb,  The  impact  of  persistent  gastrointestinal 
colonization  on  the  transmission  dynamics  of  vancomycin-resitant  enterococci,  The  Journal  of 
Infectious  Diseases,  185  (2002),  766-773. 

[17]  M.  Davidian  and  D.  M.  Giltinan,  Nonlinear  models  for  Repeated  Measurement  Data,  Mono¬ 
graphs  on  Statistics  and  Applied  Probability,  Vol.  62.  Chapman  and  Hall/CRC  Press,  Boca 
Raton,  FL,  1995. 

[18]  Klaus  Dietz  and  J.A.P.  Heesterbeek,  Daniel  Bernoullis  epidemiological  model  revisited,  Math. 
Biosci.,  180  (2002)  1-21. 

[19]  S.  N.  Ethier  and  T.  G.  Kurtz,  Markov  Processes:  Characterization  and  Convergence,  J.  Wiley 
and  Sons,  New  York,  1986. 

[20]  D.  T.  Gillespie,  A  General  method  for  numerically  simulating  the  stochastic  time  evolution  of 
coupled  chemical  reactions,  J.  Computational  Physics,  22  (1976),  403-434. 

[21]  B.  Grenfell  and  A.  Dobson,  Ecology  of  Infectious  Diseases  in  Natural  Populations ,  Cambridge 
University  Press,  Cambridge,  1995. 

[22]  T.G.  Kurtz,  Solutions  of  ordinary  differential  equations  as  limits  of  pure  jump  Markov  pro¬ 
cesses,  J.  Applied  Probability,  7  (1970),  49-58. 

[23]  M.  Lipsitch,  C.  T.  Bergstrom,  and  B.  R.  Levin,  The  epidemiology  of  antibiotic  resistance  in 
hospital  paradoxes  and  prescirptions,  Proc.  Natl.  Acad.  Sci.  USA,  97  (2000),  1938-1943. 


26 


[24]  M.  A.  Montecalvo,  W.  R.  Jarvis,  J.  Uman,  D.  K.  Shay,  C.  Petrullo,  K.  Rodney,  C.  Gedris,  H.  W. 
Horowitz,  and  G.  P.  Wormser,  Infection-control  measures  reduce  transmission  of  vancomycin- 
resistant  enterococci  in  an  endemic  setting,  Annals  of  Internal  Medicine ,  131  (1999),  269-272. 

[25]  A.  R.  Ortiz,  H.  T.  Banks,  C.  Castillo-Chavez,  G.  Chowell,  C.  Torres-Viera,  and  X.  Wang,  Mod¬ 
eling  the  transmission  of  Vancomycin-resistant  enterococcus  (VRE)  in  hospitals:  a  case  study, 
Center  for  Research  in  Scientic  Computation  Technical  Report,  CRSC-TR10-05,  February, 
2010,  North  Carolina  State  University,  Raleigh. 

[26]  E.  Renshaw,  Modeling  Biological  Populations  in  Space  and  Time ,  Cambridge  Studies  in  Math¬ 
ematical  Biology  (No.  11),  Cambridge,  1991. 

[27]  G.  A.  F.  Seber  and  C.  J.  Wild,  Nonlinear  Regression,  John  Wiley  &  Sons,  New  York,  1989. 

[28]  J.  Y.  Song,  H.  J.  Cheong,  Y.  M.  Jo,  W.  S.  Choi,  J.  Y.  Noh,  J.  Y.  Heo,  W.  J.  Kim,  Vancomycin- 
resistant  Enterococcus  colonization  before  admission  to  the  intensive  care  unit:  A  clinico- 
epidemiologic  analysis,  American  Journal  of  Infection  Control ,  37  (2009),  734-740. 

[29]  G.  F.  Webb,  E.  M.  C.  D’Agata,  P.  Magal,  S.  Ruan,  A  model  of  antibiotic-resistant  bacterial 
epidemics  in  hospitals,  Proc.  Natl.  Acad.  Sci.  USA,  102  (2005),  13343-13348. 


27 


